FOREWORD


It is shown in this report that rank order statistics
and symmetric statistics are independently distributed under
the assumption of random sampling. Techniques are given
that are particularly useful in testing certain hypotheses
concerning stochastic processes.

This research was carried out in the Statistical Engineering
Laboratory, and was sponsored in part by the U. S.
Naval Ordnance Test Station, Inyokern, under Contract
NPJ*5 -507/14-3 with the National Bureau of Standards. The
Statistical Engineering Laboratory is Section 11.3 of the
National Applied Mathematics Laboratories (Division 11 of the
National Bureau of Standards), and is concerned with the
development and application of modern statistical methods in
the physical sciences and engineering.
I. Richard Savage

Summary: In this report it is shown that under the hypothesis
of randomness rank order statistics and symmetric statistics
are independent. This fact is of use in testing hypotheses,
as is shown by examples.

Introduction: A common problem in statistical inference is
"Do the observations $x_1, \ldots, x_n$ come from a population
centered at the origin?" One solution to this problem is to use
the ordinary "t" test of this hypothesis. A more careful
observer might notice that there is no assumption of normality,
which is needed in order to justify the use of the "t" test.
The observer might then interpret "centered at the origin"
as "median at the origin" and use the sign test. Finally
the observer may notice that there is no assumption that
the observations were made on independently and identically
distributed random variables. The objective of this report
is to show that many of the tests of randomness (identical
and independent random variables) are independent of the
common tests of hypotheses. More precisely it will be
shown that if one has random samples then many of the tests
of randomness will be independent of ordinary tests of
hypotheses. Or, under the null hypothesis of random sampling,
tests of randomness are independent of ordinary tests of
hypotheses. Using this fact the statistician will be able
to test both the "random" and "parametric" parts of a
hypothesis with known significance levels.

As a further example one might have x(t) an observation
on a Fundamental Random Process [Mann (1951)]. One knows
that if the null hypothesis is true then the quantities
$x(i\delta) - x[(i-1)\delta]$ $(i = 1, \ldots, n)$ form a random
sample from a normal distribution with mean zero and variance
proportional to $\delta$. To test this hypothesis one must test
for both randomness and normality. The test for randomness would
depend on the alternatives of interest; perhaps one of the run
tests [Levene (1952)] would be found suitable. The test of
normality could be performed using the classical chi square
goodness of fit test with one parameter estimated. If this were
the test program, this report will show that the run test
and the chi square test are independent under the null
hypothesis.

Names followed by dates in square brackets refer to items
in the bibliography at the end of this report.

Assumptions and Notation: We shall be concerned with real
valued random variables $x_i$ or $x_{ij}$ (depending on whether
there is one or several samples, $x_i$ being the $i$-th
observation in a sample, and $x_{ij}$ being the $j$-th observation
in the $i$-th sample). We shall assume that all of the observations
come from distributions that have continuous cumulative
distribution functions. Letting P(A) stand for the probability
of the event A (P(A|B) will stand for the conditional probability
of the event A given that the event B has occurred), the
assumption of continuity implies:

(1)  $P(x_{ij} = a) = 0$

(2)  $P(x_{ij} = x_{i'j'} \text{ and either } i \ne i' \text{ or } j \ne j') = 0$

A statistic is any measurable function of the observations
$(x_{ij})$, and will be generally denoted $t(x_{ij})$. In some of
the applications the statistics used in this report will
actually be vector valued, and consequently all of the proofs are
given in the case where $t(x_{ij})$ is a vector. In writing that
a statistic is equal to a certain value or less than or
equal to a certain value this should be interpreted in terms
of the components of the statistic. In defining certain
statistics subsequently we will not include any values if
the event in (2) above (this event will be denoted by E)
occurs. Since the event E has probability zero, any statistic
which is not defined explicitly when this event occurs may
be defined arbitrarily, without affecting the distribution
of the statistic.

The rank of $x_{ij}$ (in the $i$-th sample) is a statistic
and equals the number of observations in the $i$-th sample
whose values are less than or equal to that of $x_{ij}$.

A rank order statistic is a statistic which is a function
only of the ranks (within samples) of the observations, and
will be denoted by R.

A symmetric statistic is a symmetric function of the
observations within samples, and will be denoted by S.
The "t" statistic for two samples is an example of
a symmetric statistic. If we had several samples and from
each of them formed a run statistic [Levene (1952)], then
any function of the several run statistics would be a rank
order statistic. In particular for one sample the total
number of runs up and down is a rank order statistic.
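The two definitions can be illustrated with a small sketch. The function names and the five data values are hypothetical, and the one-sample "t" statistic of the introduction is used in place of the two-sample one to keep the example short. The sketch only checks that the runs statistic is unchanged when the data are replaced by their ranks, and that the "t" statistic is unchanged when the observations are reordered.

```python
from itertools import permutations
from math import sqrt
from statistics import mean, stdev


def ranks(sample):
    """Rank of each observation: the number of observations in the
    sample whose values are less than or equal to it (no ties assumed)."""
    return [sum(1 for y in sample if y <= x) for x in sample]


def runs_up_and_down(sample):
    """Total number of runs up and down: maximal stretches of
    consecutive increases or consecutive decreases."""
    signs = [b > a for a, b in zip(sample, sample[1:])]
    return sum(1 for k, s in enumerate(signs) if k == 0 or s != signs[k - 1])


def one_sample_t(sample):
    """One-sample "t" statistic for a mean of zero (a symmetric statistic)."""
    n = len(sample)
    return mean(sample) / (stdev(sample) / sqrt(n))


x = [0.3, -1.2, 2.5, 0.9, -0.4]   # hypothetical observations
r = ranks(x)

# A rank order statistic takes the same value on the ranks as on the data.
same_runs = runs_up_and_down(x) == runs_up_and_down(r)

# A symmetric statistic is unchanged by any reordering of the observations.
t_values = {round(one_sample_t(list(p)), 9) for p in permutations(x)}
```

Here `same_runs` should be true and `t_values` should contain a single number, since the 120 reorderings of `x` all give the same "t".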

Lemmas: In the following lemmas we shall need the sets
$A_j$ defined as follows: $(x_i)$ belongs to $A_j$ if

$x_{j_1} < x_{j_2} < \cdots < x_{j_n}$, where $(j_1, \ldots, j_n)$

is a permutation of the first n integers. Then it is clear
that there are $n!$ sets $A_j$, and that they are all disjoint.
Further the sum of the sets $A_j$ and the set corresponding
to the event E [see (2) above] are disjoint and together
form n-dimensional Euclidean space $E_n$.

Lemma I: $P[R = r \mid (x_i) \text{ belongs to } A_j]$ is one or zero
depending on whether or not there is at least one point $(x'_i)$
in $A_j$ such that $R(x'_i) = r$.

Proof: If $(x_i)$ and $(x'_i)$ are two points in $A_j$ then it is
clear that $x_i$ and $x'_i$ have the same ranks for each
$i = 1, 2, \ldots, n$. But a rank order statistic is a function of
the ranks only, and therefore any rank order statistic can assume
only one value in the set $A_j$. From this the lemma follows.

Lemma II: Under the assumption that $(x_i)$ is a random
sample from a continuous distribution:

$P[(x_i) \text{ belongs to } A_j] = 1/n!$

Proof: Let F(x) be the distribution function, then:

$P[(x_i) \text{ belongs to } A_j] = \int \cdots \int_{x_{j_1} < \cdots < x_{j_n}} dF(x_1) \cdots dF(x_n)$

[Let $F(x_i) = y_i$]

$= \int \cdots \int_{0 < y_{j_1} < \cdots < y_{j_n} < 1} dy_1 \cdots dy_n$

$= 1/n!$  Q. E. D.
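Lemma II can be checked numerically. The following is a rough simulation sketch, not part of the original argument: the exponential distribution, the sample size, the number of draws, and the seed are all arbitrary choices made for illustration, and indices are counted from 0 rather than 1.

```python
import random
from collections import Counter
from math import factorial


def ordering(sample):
    """Return the permutation (j_1, ..., j_n), counted from 0, for which
    x_{j_1} < ... < x_{j_n}; this identifies the set A_j holding the sample."""
    return tuple(sorted(range(len(sample)), key=lambda i: sample[i]))


def ordering_frequencies(n, draws, seed=0):
    """Relative frequency of each ordering among simulated random samples
    from a continuous (here exponential) distribution."""
    rng = random.Random(seed)
    counts = Counter(
        ordering([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(draws)
    )
    return {perm: c / draws for perm, c in counts.items()}


freqs = ordering_frequencies(n=3, draws=60000)
target = 1 / factorial(3)   # Lemma II: each of the 3! orderings has probability 1/6
```

Each of the six entries of `freqs` should lie close to `target`, whatever continuous distribution is substituted for the exponential.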
Lemma III: Under the assumption of sampling from a distribution
with a continuous distribution function:

$P[S \le s \mid (x_i) \text{ belongs to } A_j] = P(S \le s)$.

Proof: Let $S_j$ be the set of points which are in $A_j$ and such
that $S \le s$. Then it is clear that, for every pair of
permutations $j$ and $j'$:

$P(S_j) = P(S_{j'})$

Hence using Lemma II we have that:

$P[S \le s \mid (x_i) \text{ belongs to } A_j]$

$= \dfrac{P[S \le s \text{ and } (x_i) \text{ belongs to } A_j]}{P[(x_i) \text{ belongs to } A_j]}$

$= P(S_j) \cdot n!$

$= \sum_j P(S_j)$

$= P(S \le s)$  Q. E. D.

Fundamental Result: In this section we state and prove
a theorem which is immediately useful for the problem about
the Fundamental Random Process mentioned in the introduction.

Theorem I: In samples of fixed size n of random
variables $(x_i)$ which are independently and identically
distributed, any rank order statistic and any symmetric statistic
are independently distributed.
Proof: Clearly any particular rank order statistic R
can take on at most $n!$ values and therefore has a discrete
distribution. And thus we perform the following analysis to
show the independence of R and S:

$P(R = r \text{ and } S \le s)$

$= P[R = r \text{ and } S \le s \text{ and } (x_i) \text{ belongs to } E]$
$\quad + P[R = r \text{ and } S \le s \text{ and } (x_i) \text{ does not belong to } E]$

$= P[R = r \text{ and } S \le s \text{ and } (x_i) \text{ belongs to the sum of the } A_j\text{'s}]$

$= \sum_j P[R = r \text{ and } S \le s \text{ and } (x_i) \text{ belongs to } A_j]$

$= \sum_j P[S \le s \mid R = r \text{ and } (x_i) \text{ belongs to } A_j] \, P[R = r \text{ and } (x_i) \text{ belongs to } A_j]$

(Now by Lemma I)

$= \sum_j P[S \le s \mid (x_i) \text{ belongs to } A_j] \, P[R = r \text{ and } (x_i) \text{ belongs to } A_j]$

(Now by Lemma III)

$= P(S \le s) \sum_j P[R = r \text{ and } (x_i) \text{ belongs to } A_j]$

$= P(S \le s) \, P(R = r)$  Q. E. D.
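The conclusion of Theorem I can be examined by simulation. The sketch below is only illustrative: it assumes normal observations, takes R to be the total number of runs up and down and S to be the sample mean, and uses an arbitrary cut-off s, sample size, number of draws, and seed. It estimates how far the joint probability is from the product of the marginal probabilities.

```python
import random


def runs_up_and_down(sample):
    """Total number of runs up and down (a rank order statistic)."""
    signs = [b > a for a, b in zip(sample, sample[1:])]
    return sum(1 for k, s in enumerate(signs) if k == 0 or s != signs[k - 1])


def max_dependence_gap(n=6, draws=40000, s=0.2, seed=1):
    """Largest estimated |P(R = r and S <= s) - P(R = r) P(S <= s)| over the
    observed values r, where S is the sample mean (a symmetric statistic)."""
    rng = random.Random(seed)
    joint, r_count, s_count = {}, {}, 0
    for _ in range(draws):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = runs_up_and_down(x)
        r_count[r] = r_count.get(r, 0) + 1
        if sum(x) / n <= s:
            s_count += 1
            joint[r] = joint.get(r, 0) + 1
    p_s = s_count / draws
    return max(
        abs(joint.get(r, 0) / draws - (c / draws) * p_s)
        for r, c in r_count.items()
    )


gap = max_dependence_gap()
```

Under the theorem `gap` differs from zero only through sampling error, so with this many draws it should be a small fraction of one percent.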
Applications of Theorem I: (1) We wish to test the hypothesis
that observations $x_1, \ldots, x_n$ (made in that order) come from
a distribution with median zero. The possible alternatives
are that the median is greater than zero, or that the median
is shifting toward larger values as the successive observations
are made. To test this hypothesis it would be appropriate to
use the sign test statistic [MacStewart (1941)], and the [Mann
(1945)] statistic for trend. It is clear that large values
of either of these statistics should be used to reject the
null hypothesis. The sign test statistic is a symmetric
statistic and the Mann statistic is a rank order statistic
so clearly Theorem I is applicable.

In order to make a test of significance at the $\alpha$ level,
using these two statistics, there are available two standard
techniques. We may use the $\chi^2$ technique for combining
tests of significance. An extensive analysis of this method
can be found in [Wallis (1942)], see also [Fisher (1925)].
The other technique is to choose levels of significance
$\alpha_1$ and $\alpha_2$ for the two tests based on the two
statistics. Then the probability of rejecting the null hypothesis
(when true) by at least one of the tests is

$1 - (1 - \alpha_1)(1 - \alpha_2) = \alpha_1 + \alpha_2 - \alpha_1 \alpha_2$.

Since $\alpha_1$ and $\alpha_2$ are at our disposal we can usually
pick a desired combination of them so that
$\alpha_1 + \alpha_2 - \alpha_1 \alpha_2 = \alpha$. These techniques
for combining independent tests can of course be used in all of
the remaining examples. When there are more than two independent
tests the $\chi^2$ method is still applicable, and in the other
technique we pick levels of significance $\alpha_1, \ldots, \alpha_c$
(there are c independent tests). Then the probability that the
null hypothesis will be rejected (when true) is
$1 - \prod_{i=1}^{c} (1 - \alpha_i)$ and we can usually choose the
$\alpha_i$ such that

$\alpha = 1 - \prod_{i=1}^{c} (1 - \alpha_i)$.
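Both ways of combining independent tests can be sketched numerically. The function names below are hypothetical. The chi square combination is written in the form usually attributed to Fisher, which assumes each significance probability is uniformly distributed under the null hypothesis; for discrete statistics such as the ones above this holds only approximately.

```python
from math import exp, log


def equal_levels(alpha, c):
    """Common per-test level a satisfying 1 - (1 - a)**c = alpha
    for c independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / c)


def overall_level(levels):
    """Probability that at least one of the independent tests rejects
    a true null hypothesis: 1 - prod(1 - alpha_i)."""
    keep = 1.0
    for a in levels:
        keep *= 1.0 - a
    return 1.0 - keep


def fisher_combined_p(p_values):
    """Chi square combination: -2 * sum(log p_i) is chi square with 2k
    degrees of freedom under the null hypothesis. For even degrees of
    freedom the upper tail is exp(-x/2) * sum_{i<k} (x/2)**i / i!."""
    k = len(p_values)
    half = -sum(log(p) for p in p_values)   # this is x / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return exp(-half) * total


two_test_level = overall_level([0.02, 0.03])   # 0.02 + 0.03 - 0.02 * 0.03
per_test = equal_levels(0.05, 3)               # level for each of three tests
```

With levels 0.02 and 0.03 the overall level is 0.0494, and three tests each run at level `per_test` give an overall level of 0.05.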

1=1 

(2) We wish to test the hypothesis that x(t) is an
observation from a fundamental random process, and the
alternatives are that $x(i\delta) - x[(i-1)\delta]$ $(i = 1, \ldots, n)$
are either not normally distributed or are autocorrelated. The
alternative of not being normally distributed may be tested by
using a $\chi^2$ goodness-of-fit test on the variables
$x(i\delta) - x[(i-1)\delta]$, after estimating the common variance
of the random variables $x(i\delta) - x[(i-1)\delta]$. This is a
symmetric statistic. The autocorrelation may be tested by using
the rank order autocorrelation coefficients [see Noether (195-)].
These are clearly rank order statistics. Thus Theorem I assures
us that under the null hypothesis the $\chi^2$ statistic is
independent of the rank order autocorrelation coefficients.
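As an illustration of why such a coefficient is a rank order statistic, the sketch below computes a lag-h serial correlation of the ranks. This particular (non-circular) form is an assumption made for the example and is not necessarily the coefficient studied in the cited reference; the data values are hypothetical.

```python
from math import exp


def ranks(sample):
    """Rank of each observation within the sample (no ties assumed)."""
    return [sum(1 for y in sample if y <= x) for x in sample]


def rank_serial_correlation(sample, lag=1):
    """Serial correlation of the ranks at the given lag: products of
    centered ranks `lag` apart, divided by the sum of squared centered ranks."""
    r = ranks(sample)
    n = len(r)
    m = (n + 1) / 2   # mean of the ranks 1, ..., n
    num = sum((r[i] - m) * (r[i + lag] - m) for i in range(n - lag))
    den = sum((ri - m) ** 2 for ri in r)
    return num / den


x = [0.3, -1.2, 2.5, 0.9, -0.4]   # hypothetical increments
original = rank_serial_correlation(x)
# An increasing transformation leaves the ranks, and so the coefficient, unchanged.
transformed = rank_serial_correlation([exp(v) for v in x])
```

Because the coefficient depends on the data only through the ranks, `original` and `transformed` agree exactly.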
Several Samples: In this section we shall state and prove
a theorem analogous to Theorem I, applicable to several samples.

Theorem II: Assume that the set of random variables
$(x_{ij})$ $(j = 1, \ldots, n_i;\ i = 1, \ldots, N)$ are independently
distributed and identically distributed for fixed i ($i$-th sample),
then any rank order statistic of the N samples is independent of
any symmetric statistic of the N samples.

Proof: The sample space is again the Euclidean n-space $E_n$
($n = \sum_{i=1}^{N} n_i$). $E_n$ can be written as the product of
N Euclidean spaces of dimension $n_i$, that is

$E_n = \prod_{i=1}^{N} E_{n_i}$.

Further each $E_{n_i}$ can be decomposed into sets $E^i$ and $A^i_j$
$(i = 1, \ldots, N;\ j = 1, \ldots, n_i!)$, the sets $E^i$ and $A^i_j$
being analogous to the sets E and $A_j$ of the lemmas. Thus

$E_n = \prod_{i=1}^{N} \left( \sum_j A^i_j + E^i \right)$

which can be written as:

$E_n = \sum_k B_k + E'$.

Here $E'$ is the union of all of those sets in $E_n$, as expressed
in the above product, which contain at least one $E^i$. That is,
$E'$ is the set of those points in $E_n$ with at least two
coordinates equal within a sample. The $B_k$ are the
$\prod_{i=1}^{N} n_i!$ disjoint sets of the form
$\prod_{i=1}^{N} A^i_{j_i}$, that is, no ties within samples.
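It is clear by the continuity assumptions that:

$P(E') = 0$.

Then:

$P(S \le s \text{ and } R = r)$

$= \sum_k P[S \le s \text{ and } R = r \text{ and } (x_{ij}) \text{ belongs to } B_k]$
$\quad + P[S \le s \text{ and } R = r \text{ and } (x_{ij}) \text{ belongs to } E']$

$= \sum_k P[S \le s \text{ and } R = r \text{ and } (x_{ij}) \text{ belongs to } B_k]$

$= \sum_k P[S \le s \mid (x_{ij}) \text{ belongs to } B_k \text{ and } R = r] \, P[(x_{ij}) \text{ belongs to } B_k \text{ and } R = r]$

(Now by the analogues of Lemmas I and III for the sets $B_k$)

$= P(S \le s) \, P(R = r)$  Q. E. D.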

Applications of Theorem II: (1) Assume we have an observation
on a stochastic process and wish to test the hypothesis that
the increments are independent and identically distributed.
The alternatives we wish to guard against are such things
as autocorrelation and that x(t) has some marked change at
some time $t_0$ [say x(t) is displacement in the horizontal of a
shell and $t_0$ is the approximate time at which the shell
stops rising and starts descending]. Let
$y_i = x(t_0 + i\delta) - x[t_0 + (i-1)\delta]$ and
$i = -(n-1), \ldots, n$. Then the null hypothesis implies that
$y_i$ $[i = -(n-1), \ldots, 0]$ and $y_i$ $(i = 1, \ldots, n)$ form
two samples of identical and independent observations. We might
use the [Kolmogoroff (1941)] goodness-of-fit test to see that the
observations from both samples come from the same distribution.
Let the statistic used be called S. We can test autocorrelation
by computing statistics $R'$ and $R''$ which are the rank order
autocorrelation coefficients from the first and second samples
respectively. The vector $(R', R'')$ corresponds to the quantity R
of the theorem, and since S is symmetric we have under the null
hypothesis that $(R', R'')$ and S are independent. Also of course
under the null hypothesis $R'$ and $R''$ are independent.

(2) N litters of animals ($n_i$ in the $i$-th litter) are to
be used for an experiment. Preliminary to the experiment we
wish to test that average weights of the litters are the same.
The null hypothesis is that all of the weights are independent,
normal with same mean and variance. As alternatives we shall
be concerned if there is a birth order effect within litters
or that the litters have different average weights. The F
test of analysis of variance will test the homogeneous nature
of the means, and $R_i$, the total number of runs up and down in
the $i$-th litter, will be a test of birth order. It is clear
that the F is a symmetric statistic and the vector $(R_i)$ is a
rank order statistic and therefore the theorem is applicable.
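The two statistics of this example might be computed as in the following sketch. The litter weights are made-up numbers listed in birth order, and the function names are hypothetical. The sketch shows that F ignores the order within litters, while the vector of run totals depends on that order alone.

```python
def runs_up_and_down(sample):
    """Total number of runs up and down within one sample."""
    signs = [b > a for a, b in zip(sample, sample[1:])]
    return sum(1 for k, s in enumerate(signs) if k == 0 or s != signs[k - 1])


def f_statistic(groups):
    """One-way analysis of variance F: between-group mean square
    divided by within-group mean square."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical weights, each litter listed in birth order.
litters = [
    [1.9, 2.4, 2.1, 2.6],
    [2.2, 2.0, 2.7],
    [1.8, 2.5, 2.3, 2.9, 2.1],
]
reordered = [sorted(g) for g in litters]   # same weights, different order

run_vector = [runs_up_and_down(g) for g in litters]
f_original = f_statistic(litters)
f_reordered = f_statistic(reordered)
```

Reordering within litters leaves F unchanged up to rounding, which is the symmetry the theorem requires, whereas the run totals change from `run_vector` to one run per litter.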
Power: It is hard to describe the power functions of tests
based on several statistics, when the several statistics
do not remain independent. When the tests are independent,
even under the alternative hypotheses, it is possible to find
the power function for each test separately and then of course
it is possible to find the combined power function in most cases
(it is trivial if we combine the test results by the second method
discussed above).

If the random variables do remain independent and
identically distributed, then even though the parametric
portion of the null hypothesis may be false, the statistics
used to test the "random" and "parametric" parts of the null
hypothesis do remain independent.

Generalizations: The results of this report may be extended
to the case where the observations instead of being real
valued are simply points in any abstract space. In this case
the definition of symmetric statistic is not affected.
However, rank order statistics must be redefined. One way
of doing this is to form a real valued functional of the
observations, and define the rank order statistics in terms
of the values of this functional. If this is done in such
a manner that the set E still remains of measure zero then
the theorems of the report will be applicable to this more
general system.